¹Department of Agronomy, Iowa State University, Ames, IA, USA
²Department of Electrical Engineering, Iowa State University, Ames, IA, USA
³Department of Mechanical Engineering, Iowa State University, Ames, IA, USA
Received 05 May 2019 | Accepted 06 Jul 2019 | Published 28 Jul 2019
The rate of advancement in phenomic-assisted breeding methodologies has lagged behind that of genomic-assisted techniques, which are now a critical component of mainstream cultivar development pipelines. However, advances in phenotyping technologies have given plant scientists affordable high-dimensional datasets with which to optimize the operational efficiency of breeding programs. Phenomic and seed yield data were collected across six environments for a panel of 292 soybean accessions with varying levels of genetic improvement. Random forest, a machine learning (ML) algorithm, was used to map complex relationships between phenomic traits and seed yield, and prediction performance was assessed using two cross-validation (CV) scenarios consistent with breeding challenges. To develop a prescriptive sensor package for future high-throughput phenotyping deployment that meets breeding objectives, feature importance was combined with a genetic algorithm (GA) technique to select a subset of phenotypic traits, specifically optimal wavebands. The results demonstrate that fusing ML and optimization techniques can identify a suite of in-season phenomic traits that will allow breeding programs to reduce their dependence on resource-intensive end-season phenotyping (e.g., seed yield harvest). While we illustrate with soybean, this study establishes a template for deploying multitrait phenomic prediction that is readily amenable to any crop species and any breeding objective.